Seybold  Special  Report 


Volume  2,  Number  3 


IN CONTRAST WITH last year, when
the electronic document delivery
market seemed suddenly to explode
with new possibilities and lots of confusion,
this year was one of technology maturation
and gradual customer implementations that
are providing a body of reference from which
others can learn. However, despite some
advancements, the market is still confusing.

In the past 12 months, page-turning
technology for easily transmitting single
documents electronically has gone from the
preview stage to shrink-wrapped, low-cost
commercial products available at retail
computer outlets. In the field, page-turners
such as Adobe Acrobat and No Hands Common
Ground are providing real cost and
productivity benefits, particularly for
print-on-demand applications.

At the same time that distribution of
single documents on an ad-hoc basis is
becoming a commodity technology, delivery
and retrieval of documents from electronic
libraries is becoming more of a puzzle. Now
that it is becoming easier and easier to
archive compound documents, along with text,
in an electronic repository, how do we go
about organizing that repository? In what
data formats should the documents be saved?
What kinds of tools will users need to locate
documents in a collection? Are those tools
different for CD-ROM and so-called media
servers? How should documents be presented
to users for browsing or retrieval? If I'm
a commercial publisher, how do I charge for
accessing documents? How do I maintain
copyright protection?

The answers to these and myriad other
related questions are changing so quickly that
making a cohesive picture turns out to be
more like building an ongoing collage than
painting a landscape. We understand the
difficulty that poses when reading the pieces
that follow on their own, without seeing the
rest of the pieces needed to get a complete
picture. But to write up what is happening in
this market from one viewpoint, as if it could
be told completely as a cohesive story, would
not be a true reflection of what is going on.
It may be completely unnerving, but for now
it is best just to try to stay current with the
events; the overall picture is going to take
time to emerge.


Issues and Trends


From  searching  to 
knowledge  extraction 

As it becomes easier to file documents
electronically, it is becoming harder to locate
relevant information in the archives. As in a
conventional library, an electronic library can
catalog its items according to classification
schemes and indexes. But one problem of
paper filing systems and ordinary libraries also
applies to electronic repositories: once the
archive becomes large, it becomes very
difficult to locate information by topics that
have not been indexed ahead of time.

Is there any way to do a better job of
extracting knowledge from electronic archives
than paper ones? This question was posed by
moderator Carl Frappaolo at the beginning
of the session "Extracting Knowledge from
Document Archives." By the end of the
session, the answer was a tentative yes, but not
to the degree that people would like or expect.

The panel was made up of three vendors
of text-retrieval engines, namely ConQuest,
Fulcrum and Personal Library Software
(PLS), along with John Dawes of Adobe,
representing Acrobat and Verity.


Improving upon boolean logic. The
vendors all discussed ways that they have
improved the ease of making queries, the
precision of those queries and the flexibility
in running queries against multiple kinds of
information. Vendors of both CD-ROM and
server-based products are adding a variety of
querying tools beyond conventional boolean
queries. These methods include:

1. Relevance ranking. Most vendors now
offer some means of sorting the hit list
that results from a query according to how
relevant the system thinks the material is
to the query. In the simplest sense, this
involves sorting by the greatest number of
hits within a document. Some vendors
apply a statistical model to derive
relevance. A more advanced feature is to show
a graphical representation of the clustering
of the hits, which immediately gives
the user feedback as to what sections or
documents are particularly relevant.

2. Lexical help. Many vendors also extend
queries with the use of word stems,
thesauruses and dictionaries. With these tools,
a simple query can be extended to include
other related words.

3. Concept searches. Some systems let you
group words into user-defined concepts
or topics. For example, you might want a
topic query, called nafta, to look for
documents that contain Mexico or Canada
within a paragraph of trade or tariff.
ConQuest provides thousands of concepts out
of the box; others, such as Verity, provide
a way for users to create their own topics.

4. Natural-language queries. Because
boolean queries are hard to write, some
vendors have developed ways to write queries
in English or other languages that people
speak, rather than asking people to learn
the language of the computer. The
system then takes care of translating the
natural-language query into the language
of the software.

5. Query by example. A quantum leap
forward in full-text retrieval is the notion of
querying by example. In this method, the
user does not formulate a query at all. All
you have to do is swipe a selection of text
and tell the system to "find me more
documents like this."

6. Mixing of full-text with fielded searches.
Fulcrum has built into its latest product
the concept of writing a single query that
runs against both an SQL, fielded database
and a full-text repository. This mixture of
full-text and SQL queries represents the first
step in what will undoubtedly be complex
APIs for running queries against all
kinds of information: text, image, video,
financial, and so forth.
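
To make the first and third of these methods concrete, here is a minimal sketch in Python. It is ours, not any vendor's implementation: the function names, the sample documents and the rule that a blank line separates paragraphs are all assumptions made for illustration. It ranks documents by raw hit count (relevance ranking in the simplest sense) and tests the nafta topic by requiring a country word and a subject word in the same paragraph.

```python
import re


def tokenize(text):
    """Lowercase a passage and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def rank_by_hits(documents, terms):
    """Relevance ranking in the simplest sense: sort by number of term hits."""
    wanted = {t.lower() for t in terms}
    scored = []
    for name, text in documents.items():
        hits = sum(1 for word in tokenize(text) if word in wanted)
        if hits:
            scored.append((hits, name))
    # Highest hit count first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


def matches_concept(text, group_a, group_b):
    """Concept search: some paragraph holds a word from each group."""
    for paragraph in text.split("\n\n"):  # assumption: blank line = paragraph break
        words = set(tokenize(paragraph))
        if words & group_a and words & group_b:
            return True
    return False


# Hypothetical topic definition modeled on the nafta example in the text.
NAFTA_COUNTRIES = {"mexico", "canada"}
NAFTA_SUBJECTS = {"trade", "tariff"}

# Made-up sample documents.
documents = {
    "memo": "Canada lowered one tariff.\n\nA second tariff on steel stays.",
    "travel": "Visit Mexico in winter.\n\nTrade winds keep the coast cool.",
    "sports": "The home team won again.",
}

ranked = rank_by_hits(documents, ["tariff", "trade"])
nafta_hits = [name for name, text in documents.items()
              if matches_concept(text, NAFTA_COUNTRIES, NAFTA_SUBJECTS)]
print(ranked, nafta_hits)
```

In this sample, the memo ranks ahead of the travel piece, and only the memo satisfies the topic, because the travel piece mentions Mexico and trade in different paragraphs. A statistical relevance model or a thesaurus would replace the plain hit count and the fixed word sets.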

Despite these advances, full-text retrieval
has severe limitations for finding information
when you are not sure what documents




November  29,  1993 


Seybold  San  Francisco  '93,  Part  II 


[Figure: precision and recall plotted against the number of documents retrieved.]

Precision = Relevant Retrieved / Total Retrieved

Recall = Relevant Retrieved / Total Relevant

Algorithms aren't enough. Matt Koll of PLS presented this graph,
which shows that with full-text retrieval systems, the more documents
your query locates, the lower the percentage of hits that are actually
relevant to the search. His point was that for real advances in retrieval
to happen, vendors will have to focus on human interaction and aspects
of retrieval other than search algorithms.


you are hunting for. The information
repository may contain relevant items, but the best
full-text retrieval can do in most cases is find
25% of the relevant information. (ConQuest
claimed that it could get that number closer
to 50%; Matthew Koll of PLS, which has had
much more experience with very large
archives, disputed that claim. We'd note that
even 50% is not great, especially if the crucial
documents are in the half not found.)
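
The two measures behind these percentages are simple to compute. The sketch below is ours and uses the standard definitions: recall is the share of all relevant documents that the query found, and precision is the share of the retrieved documents that are relevant. The document identifiers and counts are invented so that recall lands on the 25% figure cited above.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant retrieved / total retrieved
    recall    = relevant retrieved / total relevant
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    relevant_retrieved = retrieved & relevant
    precision = len(relevant_retrieved) / len(retrieved) if retrieved else 0.0
    recall = len(relevant_retrieved) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical archive: eight documents are truly relevant to the question.
relevant = {f"d{i}" for i in range(1, 9)}

# Hypothetical hit list: ten documents come back, two of them relevant.
retrieved = ["d1", "d2"] + [f"x{i}" for i in range(1, 9)]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision {precision:.0%}, recall {recall:.0%}")
```

With these made-up numbers the query finds a quarter of what is relevant, and four out of five hits are noise. Widening the query to raise recall typically pulls in more irrelevant hits, which is the trade-off the graph illustrates.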

The upshot is that full-text retrieval is a
big improvement over a pure
keyword/topic/author/title index, such as you get in the
library. Because of full-text retrieval, you can
locate information in the repository that
would otherwise be overlooked. But because
we as humans tend to change our questions
(sent to the system as queries) as we start to
get answers (looking at the hit list and
sampling documents), the percentage of recall
and precision never looks great, because we
are giving the system a moving, rather than
fixed, target.

Navigational aids. One aspect the
speakers missed that was very evident on the show
floor is the use of navigational aids to
supplement full-text retrieval. In systems designed
for delivering or housing large information
collections, navigational tools are proving to
be a useful supplement to retrieval by
queries.

1. Logical collections. The first, most
obvious idea is to group similar documents
together so that, for example, going to
the electronic file cabinet of monthly sales
reports and opening the drawer for those
from the Western sales region shows you
the Western sales reports. This idea can
be extended to topical collections.
SuperBook uses the metaphor of a bookcase
and bookshelves, where each shelf might
be a subject, similar to the way libraries
organize their shelves.

2. Structural navigation. It is all well and
good to locate documents that contain
words or phrases of interest, but for large
documents you also want some notion of
where you are in the document and, at the
same time, you want a tool for navigating
according to the contents, rather than simply
jumping to the next hit or scrolling through a
humongous text file. SuperBook is a useful
example here, too. The product pioneered the
concept of showing the user a collapsible/
expandable outline, with full-text hits
shown against each element of the
outline. EBT also uses this technique in
DynaText. Many other vendors provide
structural navigation without the link to
full-text hits.

3. Random hyperlinks. Ted Nelson pioneered
the notion of nonlinear navigation, or
hypertext, nearly 20 years ago. Today,
because the links often go to information
other than just text, we may call them
hyperlinks, but the idea is still the same:
to create the electronic equivalent of
cross-references in paper. Such links may be to
material that is not found in a full-text
search, or that is outside of the
domain of the text-retrieval system (as is
currently the case with images and video).
In electronic repositories, it can be much
easier to follow links, because the software
will take the user directly to the referenced
item, rather than asking users to pursue
those cross-references on their own.

4. Directed paths. Random links are OK for
simple cross-references, but typically the
user has no idea where the links are in the
repository ahead of time. With a product
such as Acrobat, for example, you can have
any number of pages linked to each other,
but there is no way to extract a series of
links into a path that someone else might
want to follow. The notion of directed
paths, as seen in a product such as
Westinghouse Pathways, is one we think will
prove very useful for large document
collections. It will first be used by editors who
have control over a collection. For
example, an editor of documentation manuals
might want to establish different paths for
diagnosing and correcting a problem
according to the skill level of the reader. The
editor of an electronic encyclopedia might
want to establish a path that relates to a
subject, such as jet propulsion, in which a
story is told through a series of articles
and screens that may or may not have the
words jet propulsion in them.
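
One way to picture a directed path is as a named, annotated, ordered list of stops that an editor lays over the collection. The sketch below is our own illustration and does not describe how Westinghouse Pathways or any other product stores its paths; the class, its fields and the sample articles are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DirectedPath:
    """A named, ordered trail through a collection, set up by an editor."""
    name: str
    audience: str
    steps: list = field(default_factory=list)  # (document id, editor's note)

    def add_step(self, doc_id, note=""):
        """Append one stop, with an optional note explaining why it is there."""
        self.steps.append((doc_id, note))

    def walk(self, collection):
        """Yield each stop in the order the editor chose."""
        for doc_id, note in self.steps:
            yield doc_id, note, collection[doc_id]


# Made-up encyclopedia entries; only one contains the phrase itself.
collection = {
    "reaction": "Every action produces an equal and opposite reaction.",
    "turbine": "A gas turbine compresses air, burns fuel and expels exhaust.",
    "jet": "Jet propulsion moves an aircraft by expelling a stream of gas.",
    "sailing": "A sailboat is driven by wind acting on its sails.",
}

path = DirectedPath(name="How jets fly", audience="general reader")
path.add_step("reaction", "Start with the underlying physics.")
path.add_step("turbine", "Then see how the engine produces thrust.")
path.add_step("jet", "Finish with the article on the subject itself.")

visited = [doc_id for doc_id, _note, _text in path.walk(collection)]
full_text_hits = [doc_id for doc_id, text in collection.items()
                  if "jet propulsion" in text.lower()]
print(visited, full_text_hits)
```

The path visits three articles in a deliberate order, while a full-text search for the phrase finds only one of them. A second path over the same collection, aimed at a different skill level, would simply be another DirectedPath with different steps.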

Everybody offers everything. Once you
move beyond electronic delivery of single
documents (that is, past products like No
Hands Common Ground and Farallon
Replica), there are few easy distinctions to draw
among the access methods of different
products. Some or all of these retrieval and
navigational methods may be present in any
product that handles retrieval of multiple
documents. As electronic libraries become
more prevalent, and the modularity of
electronic access increases, we will undoubtedly
see ways to combine all sorts of retrieval tools
with the leading archive formats.

There is no longer a clear market
distinction between tools aimed at commercial
publishers and those designed for inhouse use.
At one time, CD-ROM authoring tools were
aimed primarily at commercial publishers.
Today half of the CD-ROMs produced are
created by inhouse publishers, and it is
becoming increasingly easy for them to make
these discs directly from PostScript, SGML and
several page makeup formats, as well as with
the more advanced tools for CD authoring.
With the advent of digital highways,
server-based products such as SuperBook or
OracleBook become viable for commercial
publishing as well as the inhouse use from
which they originated. Obviously, as we'll
discuss in a moment, the copyright and
intellectual property concerns of commercial
publishers are often different from those of
inhouse publishers. But people on both sides
will be using similar tools for delivering and
retrieving information.

Maps of the paths to knowledge. As we
gain more experience in electronic archives,
it is quite possible that users will begin to
make directed paths. In the same conference
session, the full-text vendors argued that
saving queries is not very helpful because it is
too hard to say who the expert query-makers
should be. And if everyone saves their
queries, before long there are too many for users
to know which to choose. We agree that just
saving queries randomly in a pile is not
helpful.

But we disagree that past experience has
no relevance to future searches. Surely the
feedback from those crossing the Rocky
Mountains proved useful to those who
followed in their path. If you were looking for
a good place to mine or farm, then feedback
such as "that path leads to a vast desert" could
save you a lot of wasted time and energy. This
is just as true today, when we ask a librarian



where to begin looking for information about
a topic in the vast repositories that the
library offers access to. If you want to know
where to look in an electronic archive for an
explanation of cost accounting, you might
begin by asking the librarian what path to
follow, rather than expecting some presaved
query on cost accounting to be the most
relevant.

There are often several paths one can
take in looking for information, just as there
are many ways to travel and many roads to
drive on. When we want to know how to
drive from Cincinnati to Columbus we can
ask directions or buy a map. You might take
the scenic route; maybe you choose the
interstate highway; the map indicates that not
all roads (hyperlinks) are the same.

Similarly, we expect that within an
information collection, there are certain
reference points that will be used more often than
others. Obviously the major reference points,
just like cities in our travel analogy, will be
destinations for many more paths than the
obscure titles. We can imagine that if visual
navigation aids were available, they might help
us figure out in what way we intend to
investigate the electronic library.

To make this work, users will need tools
for naming, saving, annotating and visually
representing paths through the collection. As
we mentioned, these tools will arrive first for
authors and editors. We expect that the
feedback they give vendors should be useful in
developing tools for more general-purpose
use.

Eventually, with navigation methods
such as these, combined with full-text retrieval
and (perhaps most important of all) tools for
user interaction, we may one day have an
interface for extracting knowledge from
electronic archives.


Digital  highways 

High-bandwidth telecommunication is
changing both the production and the
distribution of information, and the rate of
change is accelerating with the continual
introduction of fiber optics, digital phone
switches and so forth. Some of the changes
simply make ordinary data transfers go
faster. In 1976, high-speed data communication
meant a 300-bps modem. Since then,
modems have gotten faster; the latest models
(using V.FAST protocols) can go up to 38
kbps. But that's about as far as tone-based
signaling can be pushed. To go faster, it is
necessary to turn to pure digital techniques.
For this reason, many of the booths on the
Expo floor were using ISDN phone lines
(furnished by Pacific Bell) this year, where a year
ago they would have been using modems and
voice-grade lines.
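
A little arithmetic shows what these speeds mean in practice. The sketch below is ours: it assumes a hypothetical 1-megabyte file (roughly a made-up page of PostScript), uses the modem speeds quoted above together with the 64-kbps rate of a single ISDN circuit, and ignores protocol overhead and compression, so the results are best-case transfer times only.

```python
def transfer_seconds(size_bytes, bits_per_second):
    """Best-case time to move a file, ignoring overhead and compression."""
    return size_bytes * 8 / bits_per_second


FILE_BYTES = 1_000_000  # assumption: a hypothetical 1-megabyte file

# Line rates in bits per second.
RATES = {
    "300-bps modem": 300,
    "38-kbps modem": 38_000,
    "64-kbps ISDN circuit": 64_000,
    "two ganged ISDN circuits": 128_000,
}

times = {label: transfer_seconds(FILE_BYTES, rate)
         for label, rate in RATES.items()}

for label, seconds in times.items():
    print(f"{label:26s} {seconds:10.1f} s  ({seconds / 60:7.1f} min)")
```

Under these assumptions the 1976 modem needs more than seven hours, the fastest current modem about three and a half minutes, and a single ISDN circuit about two minutes. The jump from tones to digital lines is modest per circuit; the larger gain comes from ganging circuits and from dial-up digital access being available at all.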


There are a lot of good analogies
between phone lines and roads, starting with
the fact that both are communication
systems in the broadest sense. Both are costly
to build and maintain, but are taken for
granted by their users, at least until something goes
wrong. Other parallels include:
• Failures. It pays to build a redundant
network, with more than one way to connect
any two points, as a defense against
inevitable breakdowns.
• Bottlenecks. As soon as you try to ease a
bottleneck by upgrading the weakest
component of the system, the bottleneck moves
to the next-weakest part. In a road
system, the bottleneck often occurs because
the on-ramps and access roads aren't as
well engineered as the freeway. Digitally,
the same problem occurs at the interface
between the high-speed backbone network
and the slower local loops.
• Peak loads. There are two ways to cope
with peak demand: design the system to
handle the peaks (which means high costs
and complexity, along with
underutilization most of the time) or flatten the peaks
by making the users queue up for service
(which leads to traffic jams, complaints and
failure to make deadlines).
• Traffic jams. The availability of bandwidth
generates new uses for it, so that any
channel eventually fills up.

Restructuring production. Brian
Anderson, of Contex Systems, described how a
newspaper group based in Sydney, Australia,
uses ISDN to move data from the editorial
offices on one side of town to the printing
plant on the other side. The link between the
two sites was painless to create: buy the
boxes (a network router and a line interface at
each end), order lines from the phone
company, plug everything in and pay the
monthly line fees.

In operation, the link is quite
transparent. At each site, the file servers at the other
end of the line behave the same as the local
servers, albeit somewhat slower. Nor is there
a difference in access techniques or user
interface. Users running Windows or Macs send
their files across town by simply dragging icons.

The cost is less than sending disks across
town by couriers on motorbikes: about $9
per day for dedicated lines up to 25
kilometers long and rated at 256 kbps.

The lines have enabled a significant
change in production workflows. In the old
days, a publisher would deal with a service
bureau for outputting films, a trade
separator for scanning color images and a printer.
Now, the scanning is done inhouse with
desktop drum scanners and fully made-up pages
are transmitted as PostScript files to the
printer, who outputs final films for the press. Two
trade suppliers are no longer needed.


Extinction or opportunity? Speaking on
behalf of Business Link, a New York City
service bureau, Todd Melet opined that digital
highways need not mean bankruptcy for trade
specialists. Rather, they might have the
opposite effect of opening new markets,
allowing the specialist to serve customers at much
greater distances than before. For example,
Business Link uses ISDN and Switch-56 data
links to serve roughly a thousand ad
agencies and designers. Most are in New York,
but he has some customers clear across the
country.

Melet noted that, while a daily
newspaper could easily justify a dedicated line
between its editorial and printing sites, most
design shops don't do enough business with
any one supplier to justify the equipment cost.
However, ISDN and Switch-56 are dial-up
services; you can call up any properly equipped
phone whenever you need to. The trick is to
develop applications that go beyond simply
sending pages. Ordering images from a
stock-photo dealer, placing display ads in
newspapers, interactive design conferences,
interconnecting LANs and even telecommuting
from an at-home office are all possible
now. At today's prices ($40/month in New
York City), it doesn't take very many such
uses to tip the scale in favor of ISDN.


WAIS on the Internet. Publishers who
are looking for an electronic outlet for their
information can use an existing distribution
medium: the Internet. This is a web of about
50,000 commercial, governmental and
academic networks, encompassing 1.7 million
host computers in 91 countries. It used to
be restricted to non-commercial traffic, a
legacy of its origins as an academic research
project funded by the U.S. government. Now,
however, commercial traffic (carried over
non-government-funded wires) is the
fastest-growing component of the Internet.

To help publishers seeking electronic
outlets, WAIS (Wide Area Information
Servers) was founded as a for-profit corporation
(see Digital Media, Vol. 1, No. 9). It has
defined standard server protocols for searching
text and image databases, and it offers server
technology and access tools to its customers.

The result: An information provider now
can have a global market, reaching many
more potential customers than would be
possible with a private service. In his
conference presentation, WAIS director John
Duhring described the experience of
Counterpoint Publishing, which handles the
online version of Commerce Business Daily.
Initially, Counterpoint had a line-oriented user
interface and direct dial-up access to its
server, and typically transmitted about 2,000
documents per month to subscribers. It then
adopted the Gopher user interface (developed
at the University of Minnesota, home of the



Alphabet  Soup,  Telco  Style 

Believe it or not, the telephony business has even more acronyms than the computer
industry. ISDN stands for integrated services digital network, and for practical purposes,
it means one twisted-pair circuit carrying data at 64k bits per second. You can gang
multiple circuits together to obtain higher speeds. By digitizing the sound, one circuit
is sufficient for two separate audio conversations plus some 9,600-bps data on the
side. This is how it is being marketed for at-home offices, for example.

ATM, in this business, has nothing to do with cash machines. It stands for
asynchronous transfer mode, which is going to be the next big communication and wide-area
networking standard. It runs at a wide range of speeds, but it is cost-effective now
only for greater-than-10-megabit/second lines.

There is also a phone-company acronym for an ordinary rotary-dial voice line. It is POTS,
which stands for plain old telephone service.


Fighting Gophers) and joined the Internet;
now a typical month sees about 50,000
documents transmitted.

There are some restrictions on the way
business may be transacted over the net. These
are mainly due to Internet "culture" rather
than to laws. The Internet is not a single
entity, but a loose consortium of
cooperating entities, all of which own and fund their
piece of the web. It is governed by an ethos
of courtesy and reciprocity, not by laws and
regulations. Thus, Rule One says that all
network services shall be passive; someone
must request a document before it can be
transmitted. There shall be no "junk E-mail"
or advertising broadsides to solicit business.

There are also some issues of copyright
infringement. The belief that all information
ought to be free still lingers in many corners
of the Internet, and many an information
provider has been appalled to find his stock
in trade being openly posted on public-access
bulletin boards. But in fact, the danger is not
much different from what a CD-ROM
publisher would face: once the discs are out
there, the database is exposed. And copyright
law still applies to Internet data, just as it
does to CD-ROMs. As a practical matter, said
Duhring, copiers may be stealing your
intellectual property, but they are also advertising
for you.

WAIS likes to use a retail-store metaphor
for its information servers. The customer can
come into the store and look around,
browsing through the available topics. Perhaps you
(the merchant) will allow him to read the
headlines or the abstracts; but at some point
that you have selected, you can require a fee
before divulging any more. Payment can be
by the document (perhaps a credit card
number encrypted by a public-key algorithm), by
annual subscription or any other mechanism
that fits your business model.
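
The retail-store metaphor can be sketched in a few lines of Python. This is our own toy model of the idea, not the WAIS protocol or any WAIS product: the class, its method names, the fee and the sample document are all invented, and payment is reduced to recording that a fee was paid, with no encryption or billing behind it.

```python
class InformationStore:
    """Toy storefront: browsing is free, the full text sits behind a fee."""

    FREE_FIELDS = ("headline", "abstract")  # the point the merchant has selected

    def __init__(self, documents, fee):
        self.documents = documents
        self.fee = fee
        self.paid = set()  # (customer, document id) pairs already paid for

    def browse(self, doc_id):
        """Anyone may look at the free fields of a document."""
        doc = self.documents[doc_id]
        return {name: doc[name] for name in self.FREE_FIELDS}

    def pay(self, customer, doc_id, amount):
        """Record a per-document payment if it covers the fee."""
        if amount < self.fee:
            raise ValueError("payment does not cover the fee")
        self.paid.add((customer, doc_id))

    def read(self, customer, doc_id):
        """Release the full text only to a customer who has paid for it."""
        if (customer, doc_id) not in self.paid:
            raise PermissionError("fee required before the full text is released")
        return self.documents[doc_id]["body"]


# Made-up catalog with a single document.
store = InformationStore(
    documents={
        "notice-1": {
            "headline": "Sample notice",
            "abstract": "One-line summary of the notice.",
            "body": "Full text of the notice.",
        }
    },
    fee=2.00,
)

preview = store.browse("notice-1")
store.pay("reader-1", "notice-1", 2.00)
body = store.read("reader-1", "notice-1")
print(preview["headline"], "->", body)
```

An annual subscription would replace the per-document set with a per-customer expiry date; the gate between the free fields and the body stays the same.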

Impact on mom and pop. Steve Waters,
VP of the Rome (NY) Sentinel, spoke about
the implications of electronic delivery of
information. Newspapers used to think that it
would be impossible to provide a true
electronic equivalent of the daily paper, but they
are now changing their minds. Most of the
necessary hardware (computers, phone lines,
software) is available now. The only missing
piece is a large, high-resolution screen, and
that is likely to arrive soon. (Knight-Ridder's
Roger Fidler estimates that in 7-10 years
you'll have a tabloid-size screen costing $200.
It will be half an inch thick and you will be
able to take it into the bathroom.) The other
issue is the publisher's mindset, but
publishers are starting to realize that a newspaper is
just a format. An electronic newspaper can
also be formatted with type, ads, headlines
and the other navigational landmarks.

Collecting local information and
editing it into stories is what newspapers are
really all about. Writers may have to learn to
write dual news streams (one for print, the
other for the screen), but no other agency is
fitted to perform this task. Other newspaper
departments will have similar maturations; the
layout staff, for example, will become adept
at placing hypertext buttons on the screen.
Advertising departments will take on the
media-selection functions now done by ad
agencies. And the business departments will
learn how to sell computing and
information services.

Electronic information will raise other
problems. One will be copyright law, which
is far behind the pace of innovation. The law
must be updated to recognize separate uses
for information, including referential use
(hypertext pointers), reportorial use (now
covered by the "fair use" clause), advertorial
use (commercial quotation), anthological use
(full-text inclusion) and artistic use.

Another social problem may be
tougher. Hometown retail shopping has been the
economic raison d'être for small towns for
nearly a century. Now, catalogs and TV
shopping networks are gradually eliminating the
local mom-and-pop stores and thus
threatening the existence of the towns. Perhaps the
stores will be transformed into
cottage-industry fulfillment operations, executing the
orders generated by the advertising in the new
electronic medium.

Fortunately, Waters said, publishers
don't have to predict the future very far or
very accurately. They just have to be sure that
they don't dead-end in the meantime.

Guarding  your  intellectual 
property 

One of the reasons for adding an electronic
outlet for your publications is to provide richer
forms of content: sound, animation and
movies. But as soon as you venture into these
art forms, you are no longer in the familiar
world of print, and different laws now
govern your rights and duties. The legal
structures of Tin Pan Alley and Hollywood, not
Fleet Street, govern the new media.

Along with the programmers, audio
technicians, animators and other skilled
personnel, you should be sure to have a
competent intellectual-property lawyer as an early
member of your team. His mission: to make
sure that you actually have the legal rights to
use the sounds and images that you want to
use in your multimedia publication. As we
shall see, that's not so simple a task.

It starts with copyright. Most of the
rights you need have their basis in the legal
concept of copyright. This includes the right
to reproduce and distribute a work, to
modify it or make derivative works from it, and to
exhibit or perform the work in public. There
are also the well-known laws of patents and
trademarks.

However, all of these laws vary from
country to country. For example, copyright
generally lasts for the life of the creator plus
50 years. But in Germany it lasts for life plus
70 years. Before the Second World War,
copyright in several nations lasted only for 20 years
unless renewed, and thus many movies
produced before the Great Depression fell into
the public domain. But this must be verified
on a case-by-case basis.

There is also an entirely separate class of
personal rights. These include:
• Privacy. The use of people's images and
voices for commercial purposes is a right
that must be secured from the individuals
themselves. The privileges that apply to
news coverage, which essentially strip away
the privacy rights of politicians and
celebrities, typically don't extend to productions
for profit.

• Publicity. People who have built up a
public reputation have the right to control how
others exploit that reputation. For exam-




